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Abstract 



We describe an algorithm that, given any full-rank matrix A having fewer rows 
than columns, can rapidly compute the orthogonal projection of any vector onto the 
null space of A, as well as the orthogonal projection onto the row space of A, provided 
that both A and its adjoint A* can be applied rapidly to arbitrary vectors. As an 
intermediate step, the algorithm solves the overdetermined linear least-squares regres- 
sion involving A* (and so can be used for this, too). The basis of the algorithm is an 
obvious but numerically unstable scheme; suitable use of a preconditioner yields nu- 
merical stability. We generate the preconditioner rapidly via a randomized procedure 
that succeeds with extremely high probability. In many circumstances, the method can 
accelerate interior-point methods for convex optimization, such as linear programming 
(Ming Gu, personal communication). 

1 Introduction 

This article introduces an algorithm that, given any full-rank matrix A having fewer rows 
than columns, such that both A and its adjoint A* can be applied rapidly to arbitrary 
vectors, can rapidly compute 

1. the orthogonal projection of any vector b onto the null space of A, 

2. the orthogonal projection of b onto the row space of A, or 

3. the vector x minimizing the Euclidean norm \\A* x — b\\ of the difference between A* x 
and the given vector b (thus solving the overdetermined linear least-squares regression 



For simplicity, we focus on projecting onto the null space, describing extensions in Remark 4.1. 

The basis for our algorithm is the well-known formula for the orthogonal projection Zb 
of a vector b onto the null space of a full-rank matrix A having fewer rows than columns: 



It is trivial to verify that Z is the orthogonal projection onto the null space of A, by checking 
that ZZ = Z (so that Z is a projection), that Z* = Z (so that Z is self-adjoint, making 
the projection "orthogonal"), and that AZ = (so that Zb is in the null space of A for any 



A* x « b). 
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vector b). Since we are assuming that A and A* can be applied rapidly to arbitrary vectors, 
(1) immediately yields a fast algorithm for computing Zb. Indeed, we may apply A to each 
column of A* rapidly, obtaining the smaller square matrix A A*; we can then apply A to 
b, solve (AA*)x = Ab for x, apply A* to x, and subtract the result from b, obtaining Zb. 
However, using (1) directly is numerically unstable; using (1) directly is analogous to using 
the normal equations for computing the solution to a linear least-squares regression (see, for 
example, [7]). 

We replace (1) with a formula that is equivalent in exact arithmetic (but not in floating- 
point), namely 

Zb = b-A* (P*)- 1 (P- 1 A A* (P*)- 1 )' 1 P- 1 Ab, (2) 

where P is a matrix such that P" 1 A is well-conditioned. If A is very short and fat, then P is 
small and square, and so we can apply P~ l and (P*) -1 reasonably fast to arbitrary vectors. 
Thus, given a matrix P such that P _1 A is well-conditioned, (2) yields a numerically stable 
fast algorithm for computing Zb. We can rapidly produce such a matrix P via the randomized 
method of [10], which is based on the techniques of [5] and its predecessors. Although 
the method of the present paper is randomized, the probability of numerical instability is 
negligible (see Remark 4.2 in Section 4 below). A preconditioned iterative approach similar 
to that in [10] is also feasible. Related work includes [1] and [8]. 

The remaining sections cover the following: Section 2 summarizes relevant facts from 
prior publications. Section 3 details the mathematical basis for the algorithm. Section 4 
describes the algorithm of the present paper in detail, and estimates its computational costs. 
Section 5 reports the results of several numerical experiments. 

In the present paper, the entries of all matrices are real-valued, though the algorithm 
and analysis extend trivially to matrices whose entries are complex- valued. We use the term 
"condition number" to refer to the I 2 condition number, the greatest singular value divided 
by the least (the least is the m th greatest singular value, for an m x n matrix with m < n). 

2 Preliminaries 

In this section, we summarize several facts about the spectra of random matrices and about 
techniques for preconditioning. 

2.1 Singular values of random matrices 

The following lemma provides a highly probable upper bound on the greatest singular value 
of a matrix whose entries are independent, identically distributed (i.i.d.) Gaussian random 
variables of zero mean and unit variance; Formula 8.8 in [6] provides an equivalent formula- 
tion of the lemma. 

Lemma 2.1. Suppose that I and m are positive integers with I > m. Suppose further that 
H is an m x I matrix whose entries are i.i.d. Gaussian random variables of zero mean and 
unit variance, and a is a positive real number, such that a > 1 and 



is nonnegative. 

Then, the greatest singular value of H is no greater than \plla with probability no less 
than 7r + defined in (3). 

The following lemma provides a highly probable lower bound on the least singular value 
of a matrix whose entries are i.i.d. Gaussian random variables of zero mean and unit vari- 
ance; Formula 2.5 in [3] and the proof of Lemma 4.1 in [3] together provide an equivalent 
formulation of Lemma 2.2. 

Lemma 2.2. Suppose that I and m are positive integers with I > m. Suppose further that 
H is an m x I matrix whose entries are i.i.d. Gaussian random variables of zero mean and 
unit variance, and (3 is a positive real number, such that 

j / _ \ l—m+l 



7T_ = 1 — (4) 

y/2ir (l-m + 1) \(l-m + l)Pj 

is nonnegative. 

Then, the least (that is, the m th greatest) singular value of H is no less than (3) 
with probability no less than 7r_ defined in (4). 

The following lemma provides a highly probable upper bound on the condition number of 
a matrix whose entries are i.i.d. Gaussian random variables of zero mean and unit variance; 
the lemma follows immediately from Lemmas 2.1 and 2.2. For simpler bounds, see [3]. 

Lemma 2.3. Suppose that I and m are positive integers with I > m. Suppose further that 
H is an m x I matrix whose entries are i.i.d. Gaussian random variables of zero mean and 
unit variance, and a and f3 are positive real numbers, such that a > 1 and 

2a n ' 1 ' x l - m+1 



710 1 4(a 2 - l)vGd^ U a2 "V y/2n(l-m + l) \(l - m + 1) (3 ) (5) 
is nonnegative. 

Then, the condition number of H is no greater than \f2laf3 with probability no less than 
7T defined in (5). 



2.2 Preconditioning 

The following lemma, proven in a slightly different form as Theorem 1 in [10], states that, 
given a short, fat m x n matrix A, the condition number of a certain preconditioned version 
of A corresponding to an n x I matrix G is equal to the condition number of V* G, where V 
is an n x m matrix of orthonormal right singular vectors of A. 

Lemma 2.4. Suppose that I, m, and n are positive integers such that m < I < n. Suppose 
further that A is a full-rank mxn matrix, and that the SVD of A is 

A m xn = U m xm^mxm {V n xm) ? (6) 
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where the columns of U are orthonormal, the columns of V are orthonormal, and E is a 
diagonal matrix whose entries are all nonnegative. Suppose in addition that G is an n x I 
matrix such that the m x I matrix A G has full rank. 

Then, there exist an m x m matrix P, and an I x m matrix Q whose columns are or- 
thonormal, such that 

A m xnG nx i = P m xm (Qlxm) ■ (7) 

Furthermore, the condition numbers of P~ x A and V* G are equal, for any mxm ma- 
trix P, and I x m matrix Q whose columns are orthonormal, such that P and Q satisfy (7). 



3 Mathematical apparatus 

In this section, we describe a randomized method for preconditioning rectangular matrices 
and prove that it succeeds with very high probability. 

The following theorem states that a certain preconditioned version P 1 A of a short, 
fat matrix A is well-conditioned with very high probability (see Observation 3.2 for explicit 
numerical bounds). 

Theorem 3.1. Suppose that I, m, and n are positive integers such that m < I < n. Suppose 
further that A is a full-rank mxn matrix, and that the SVD of A is 

A m xn = U mX m ^mxm (Kxm) j (8) 

where the columns of U are orthonormal, the columns of V are orthonormal, and E is a 
diagonal matrix whose entries are all nonnegative. Suppose in addition that G is an n x I 
matrix whose entries are i.i.d. Gaussian random variables of zero mean and unit variance. 
Suppose finally that a and f3 are positive real numbers, such that a > 1 and tt defined in (5) 
is nonnegative. 

Then, there exist an m x m matrix P , and an I x m matrix Q whose columns are or- 
thonormal, such that 

A m xn G n xl Pmxm (Qlxm) ■ (9) 

Furthermore, the condition number of P^ 1 A is no greater than \p2laj3 with probability 
no less than 7r defined in (5), for any mxm matrix P , and I x m matrix Q whose columns 
are orthonormal, such that P and Q satisfy (9). 

Proof. Combining the facts that the columns of V are orthonormal and that the entries of 
G are i.i.d. Gaussian random variables of zero mean and unit variance yields that the entries 
of Hmxi = (Vnxm)* G nx i ar e also i.i.d. Gaussian random variables of zero mean and unit 
variance. Combining this latter fact and Lemmas 2.3 and 2.4 completes the proof. □ 

Observation 3.2. To elucidate the bounds given in Theorem 3.1, we look at some special 
cases. With m > 2 and a > 2, we find that n from Theorem 3.1, defined in (5), satisfies 



7T > 1- 



2a 2 \ l ~ m+2 



4 (a 2 - 1) ^/n(l-m + 2)a 2 \e a - 1 1 

i / \ l— m+l 

. (10) 



v/27r(/-m + l) \(l-m + l)(3 
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Combining (10) and Theorem 3.1 yields strong bounds on the condition number of P^ 1 A 
when I — m = 4 and m > 2: With / — m = 4, a 2 = 4, and (3 = 3, the condition number is 
at most 10/ with probability at least 1 — 10^ 4 . With / — m = 4, a 2 = 7, and (3 = 26, the 
condition number is at most 100/ with probability at least 1 — 10~ 9 . With / — m = 4, a 2 = 9, 
and (3 = 250, the condition number is at most 1100/ with probability at least 1 — 10~ 14 . 
Moreover, if instead of / — m = 4 we take / to be a few times m, then the condition number 
of P~ l A is no greater than a small constant (that does not depend on /), with very high 
probability (see [3]). 



4 Description of the algorithm 

In this section, we describe the algorithm of the present paper in some detail. We tabulate 
its computational costs in Subsection 4.1. 

Suppose that m and n are positive integers with m < n, and A is a full-rank mxn matrix. 
The orthogonal projection of a vector b nx i onto the row space of A mxn is A* (A A*)^ 1 Ab. 
Therefore, the orthogonal projection of b nx i onto the null space of A mxn is b— A* (A A*)^ 1 A b. 
After constructing a matrix P mxm such that the condition number of P^ 1 A is not too large, 
we may compute the orthogonal projection onto the null space of A via the identity 

b- A* {A A*)" 1 Ab = b- A* (P*)- 1 (P- 1 AA* (P*)- 1 )' 1 P' 1 Ab. (11) 

With such a matrix P mxm , (11) provides a numerically stable means for computing the 
orthogonal projection of b onto the null space of A. 

To construct a matrix P mxm such that the condition number of A is not too large, 
we choose a positive integer / such that m < I < n (/ = m + 4 is often a good choice) and 
perform the following two steps: 

1. Construct the m x / matrix 

Srnxl = A rnxn G nx [ (12) 

one column at a time, where G is a matrix whose entries are i.i.d. random variables. 
Specifically, generate each column of G in succession, applying A to the column (and 
saving the result) before generating the next column. (Notice that we do not need to 
store all nl entries of G simultaneously.) 

2. Construct a pivoted QR decomposition of S*, 

{S m xl) = Qlxm Rmxm^mxmi (13) 

where the columns of Q are orthonormal, R is upper-triangular, and II is a permutation 
matrix. (See, for example, Chapter 5 of [7] for details on computing the pivoted QR 
decomposition.) 

With P = II* R*, Theorem 3.1 and Observation 3.2 show that the condition number of P^ 1 A 
is not too large, with very high probability. 
To construct the matrix 

Y = (P- 1 AA* (P*)- 1 )' 1 (14) 
which appears in the right-hand side of (11), we perform the following three steps: 
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1. Construct the mxm matrix 

p- — n* r~ 1 ci z>) 

A mxm ^^mxm ±L mxm \ ±U J 

(recalling that II is a permutation matrix). 

2. For each k — 1, 2, . . . , m — 1, m, construct the A; th column of an m x m matrix X 
via the following five steps: 

a. Extract the /c th column c^ xl of P~. 

b. Construct djfxi = (A mxn )* cJJ; xl . 

c. Construct e^ xl = ^4 m xn<^i x i- 

d. Construct f^xi = n mX m^xi (recalling that II is a permutation matrix). 

e. Solve R m xm x< mxi = fmxi f° r x< mxi (recalling that R is an upper-triangular ma- 
trix) . 

Notice that X = P' 1 A A* (P*) _1 , where P = (RU)* and (P*)" 1 = P~. Also, because 
we generate X one column at a time, we do not need to store an extra 0(nm) floating- 
point words of memory at any stage of the algorithm. 

3. Construct the mxm matrix 

Y rn xm -^mxm- (16) 

It follows from (11) that the orthogonal projection of b onto the null space of A is 

b - A* (A A*)' 1 A b = b - A* (P*)- 1 Y P- 1 A b. (17) 

In order to apply A* (P*) _1 Y P^ 1 A to a vector 6 nx i, where P = IT* R*, we perform the 
following seven steps: 

1. Construct c mx i = A mxn b nxl . 

2. Construct d mx i = n mxm c mx i (recalling that II is a permutation matrix). 

3. Solve R* mxm e mx i = d rnx i for e mx \ (recalling that R is an upper-triangular matrix). 

4. Construct f mxl = Y mxm e mxl . 

5. Solve R mxm g mx i = fmxi for g m xi (recalling that R is an upper-triangular matrix). 

6. Construct h mx \ = Il^ nxm g mx i (recalling that II is a permutation matrix). 

7. Construct b nxl = (A mxn )* h mxl . 

The output b = A* (P*) _1 Y P^ 1 A b of the above procedure is the orthogonal projection 
of b onto the row space of A. The orthogonal projection of b onto the null space of A is then 
b-b. 
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Remark 4.1. The vector h mx \ constructed in Step 6 above minimizes the Euclidean norm 
H(Anxn)* h mxl -b nxl \\ of (A mxn )* h mxl -b nxl ] that is, h mxl solves the overdetermined linear 
least-squares regression (A mxn )* h mxl ^ b nxl . 

Remark 4.2. The algorithm of the present section is numerically stable provided that the 
condition number of P^ 1 A is not too large. Theorem 3.1 and Observation 3.2 show that the 
condition number is not too large, with very high probability. 

4.1 Costs 

In this subsection, we estimate the number of floating-point operations required by the 
algorithm of the present section. 

We denote by C A the cost of applying A mxn to an n x 1 vector; we denote by Ca* the 
cost of applying (A mxn )* to an m x 1 vector. We will be keeping in mind our assumption 
that m < I < n. 

Constructing a matrix P mxm such that the condition number of P^ 1 A is not too large 
incurs the following costs: 

1. Constructing S defined in (12) costs I • (Ca + 0{n)). 

2. Constructing R and II satisfying (13) costs 0(1 ■ m 2 ). 

Given R and II (whose product is P*), constructing the matrix Y mxm defined in (14) 
incurs the following costs: 

1. Constructing P~ defined in (15) costs 0(m 3 ). 

2. Constructing X via the five-step procedure (Steps a-e) costs m • {Ca* + Ca + 0(m 2 )). 

3. Constructing Y in (16) costs C(m 3 ). 

Summing up, we see from m < I < n that constructing R, II, and Y defined in (14) costs 

C prc = (Z + m) ■ C A + m ■ C A * + 0(1 ■ (m 2 + n)) (18) 

floating-point operations overall, where Ca is the cost of applying A mxn to an n x 1 vector, 
and Ca* is the cost of applying (A mxn )* to an m x 1 vector (/ = m + 4 is often a good 
choice). Given R and II (whose product is P*) and Y, computing the orthogonal projection 
b — b of a vector b onto the null space of A via the seven-step procedure above costs 

C pio = C A + 0(m 2 ) + C A . (19) 

floating-point operations, where again Ca is the cost of applying A mxn to an n x 1 vector, 
and Ca* is the cost of applying (A mxn )* to an m x 1 vector. 

Remark 4.3. We may use iterative refinement to improve the accuracy of h discussed 
in Remark 4.1, as in Section 5.7.3 of [4]. Each additional iteration of improvement costs 
CA + 0(m 2 ) + Ca* (since R, IT, and Y are already available). We have also found that repro- 
jecting the computed projection onto the null space often increases the accuracy to which A 
annihilates the projection. However, this extra accuracy is not necessarily meaningful; see 
Remark 5.1 below. 
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5 Numerical results 



In this section, we illustrate the performance of the algorithm of the present paper via several 
numerical examples. 

Tables 1-4 display the results of applying the algorithm to project onto the null space of 
the sparse matrix A mxn defined as follows. First, we define a circulant matrix B mxm via the 
formula 

1, j = k — 2 mod m 
—4, j — k — 1 mod m 

B jk =< Q + d : j 1 , , (20) 
-4 5 j — k + 1 mod m v ' 

1, j = k + 2 mod m 

0, otherwise 

for j, k — 1, 2, . . . , m — 1, m, where <i is a real number that we will set shortly. Taking n to 
be a positive integer multiple of m, we define the matrix A mxn via the formula 

A m xn = U mxm ■ ( B mxm | B mxm | • • • | B mxm I B mxm ) mxn • V nxn , (21) 

where U mxm and V^ xn are drawn uniformly at random from the sets ofmxm and n x n 
permutation matrices, and B mxm is defined in (20). The condition number of B is easily 
calculated to be k = (16 + <i)/<i; therefore, the condition number of A is also k = (16 + d)/d. 
In accordance with this, we chose d = 16/(/t — 1). 

For Tables 1 and 2, we took the entries of G in (12) to be (Mitchell-Moore-Brent-Knuth) 
lagged Fibonacci pseudorandom sequences, uniformly distributed on [—1,1], constructed 
using solely floating-point additions and subtractions (with no integer arithmetic); see, for 
example, Section 7.1.5 of [9]. For Tables 3 and 4, we took the entries of G in (12) to be 
high-quality pseudorandom numbers drawn from a Gaussian distribution of zero mean and 
unit variance. Though we used Gaussian variates in previous sections in order to simplify the 
theoretical analysis, there appears to be no practical advantage to using high-quality truly 
Gaussian pseudorandom numbers; as reflected in the tables, the very quickly generated 
lagged Fibonacci sequences perform just as well in our algorithm. 

Tables 5 and 6 display the results of applying the algorithm to project onto the null space 
of the dense matrix A mxn defined via the formula 

A mxn A mxn 

+ E mxl0 ■ F 10xn /Vrrm, (22) 

where A mxn is the sparse matrix defined in (21), and the entries of E mx i and F Wxn are 
i.i.d. Gaussian random variables of zero mean and unit variance. As for Tables 1-4, we 
chose d = 16/(k, — 1) for use in (20); with this choice, k is the condition number of A 
in (22), and provides a rough estimate of the condition number of A. We took advantage 
of the special structure of A (as the sum of a sparse matrix and a rank- 10 matrix) in order 
to accelerate the application of A and its adjoint to vectors. For Tables 5 and 6 (just as 
in Tables 1 and 2), we took the entries of G in (12) to be (Mitchell-Moore-Brent-Knuth) 
lagged Fibonacci pseudorandom sequences, uniformly distributed on [—1,1], constructed 
using solely floating-point additions and subtractions (with no integer arithmetic); see, for 
example, Section 7.1.5 of [9]. 
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The following list describes the headings of the tables: 

m is the number of rows in the matrix for which we are computing the orthogonal 
projection onto the null space. 

n is the number of columns in the matrix for which we are computing the orthogonal 
projection onto the null space. 

/ is the number of columns in the random matrix G from (12). 
k is the condition number of the matrix A defined in (21). 

s pre is the time in seconds required to compute A A* (or A A*, for Table 5) and its 
pivoted QR decomposition. 

s pro is the time in seconds required to compute directly the orthogonal projection 
b - A* (A A*)' 1 Ab (or b - A* (A A*)' 1 Ab, for Table^5) for a vector b, having already 
computed a pivoted QR decomposition of A A* (or A A*, for Table 5). Thus, s prc +j-s pro 
is the time required by a standard method for orthogonally projecting j vectors onto 
the null space. 

t pre is the time in seconds required to construct the matrices R, II, and Y defined in 
Section 4. 

t pro is the time in seconds required by the randomized algorithm to compute the or- 
thogonal projection of a vector onto the null space, having already computed the 
matrices R, II, and Y defined in Section 4. Thus, t pre + j ■ t pro is the time required by 
the algorithm of the present paper for orthogonally projecting j vectors onto the null 
space. 

^norm is a measure of the error of a standard method for orthogonally projecting onto 
the null space. Specifically, 5 norm is the Euclidean norm of the result of applying A 
(or A, for Table 6) to the orthogonal projection (onto the null space) of a random 
unit vector b, computed as b - A* (A A*)" 1 A b or b - A* (A A*)- 1 Ab. (In Tables 2, 4, 
and 6, 5 norm is the maximum encountered over 100 independent realizations of the 
random vector b.) The tables report <5 norm divided by the condition number, since this 
is generally more informative (see Remark 5.1 below). 

e norm is a measure of the error of a standard method for orthogonally projecting onto 
the null space. Specifically, e novm is the Euclidean norm of the difference between the 
orthogonal projection (onto the null space) of a random unit vector b (computed as 
b—A* (A A*)^ 1 Ab or b — A* (A A*)^ 1 A b) and its own orthogonal projection (computed 
as b - A* (A A*)- 1 Ab or b - A* (A A*)- 1 Ab), where b = b - A* (A A*)- 1 Ab or b = 
b — A*(AA*)~ 1 Ab (as computed). (In Tables 2, 4, and 6, £ norm is the maximum 
encountered over 100 independent realizations of the random vector b.) The tables 
report e n0 rm divided by the condition number, since this is generally more informative 
(see Remark 5.1 below). 
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Table 1: Timings for the sparse matrix A defined in (21) 



m 


n 


I 


K 


^pre 


^pro 


^pre 


^pro 


40 


1000000 


AA 


1E8 


.23E2 


.91E0 


.35E2 


.90E0 


400 


1000000 

-L \J \J \J \J \J \J 


404 


1E8 


.23E3 


.90E0 


.35E3 


.90E0 


4000 


1000000 


4004 


1E8 


.25E4 


.10E1 


.38E4 


.12E1 


1000 


3000 


1004 


1E8 


.29E1 


.85E-2 


.19E2 


.15E-1 


1000 


30000 

'J V 7 V 7 V 7 V 7 


1004 


1E8 


.69E1 


.16E-1 


.25E2 


.22E-1 


1000 


300000 

'J V 7 V 7 V 7 V 7 V 7 


1004 


1E8 


.13E3 


.21E0 


.20E3 


.21E0 


1000 


3000000 

KJ V7 V7 V7 V7 \J \J 


1004 


1E8 


.19E4 


.30E1 


.28E4 


.30E1 


onnn 
zuuu 


1 nnnnnn 
ruuuuuu 


onnn 
zuuu 


1 "FT8 


1 9TTM 

. 1ZIL7± 


Q9"Rn 

.yzri/U 


1 QT7M 


.yori/U 


2000 


1000000 


4000 


1E8 


.12E4 


.93E0 


.25E4 


.98E0 


2000 


1000000 


8000 


1E8 


.12E4 


.95E0 


.36E4 


.98E0 


4000 


1000000 


4004 


1E4 


.26E4 


.11E1 


.39E4 


.12E1 


4000 


1000000 


4004 


1E6 


.25E4 


.10E1 


.39E4 


.12E1 


4000 


1000000 


4004 


1E8 


.25E4 


.10E1 


.38E4 


.12E1 


4000 


1000000 


4004 


1E10 


.25E4 


.11E1 


.38E4 


.12E1 


4000 


1000000 


4004 


1E12 


.25E4 


.10E1 


.39E4 


.12E1 



• ^rand is a measure of the error of the algorithm of the present paper. Specifically, <5 ran d 
is the Euclidean norm of the result of applying A (or A, for Table 6) to the orthogonal 
projection (onto the null space) of a random unit vector b (with the projection com- 
puted via the randomized algorithm). (In Tables 2, 4, and 6, 5 ran( j is the maximum 
encountered over 100 independent realizations of the random vector b.) The tables 
report 5 ran( i divided by the condition number, since this is generally more informative 
(see Remark 5.1 below). 

• £rand is a measure of the error of the algorithm of the present paper. Specifically, e ran d is 
the Euclidean norm of the difference between the orthogonal projection (onto the null 
space) of a random unit vector b (with the projection computed via the randomized 
algorithm), and the projection of the projection of b (also computed via the algorithm 
of the present paper). (In Tables 2, 4, and 6, £ ran d is the maximum encountered over 
100 independent realizations of the random vector b.) The tables report e ran d divided 
by the condition number, since this is generally more informative (see Remark 5.1 
below). 

Remark 5.1. Standard perturbation theory shows that, if we add to entries of the matrix A 
random numbers of size 5, and the unit vector b has a substantial projection onto the null 
space of A, then the entries of the vector h minimizing the Euclidean norm \\A* h — 6|| 
change by about 5 ■ C 2 , where C is the condition number of A (see, for example, formula 
1.4.26 in [2]). Thus, it is generally meaningless to compute h more accurately than e ■ C 2 , 
where e is the machine precision, and C is the condition number. Similarly, it is generally 
meaningless to compute the orthogonal projection onto the null space of A more accurately 
than e ■ C, where e is the machine precision, and C is the condition number of A. 



10 



Table 2: Errors for the sparse matrix A defined in (21) 



771 


Tl 




K 


A Ik 
"norm/ ™ 


C~ 1 L? 

c norm/ 


A , Ik 
Orand/ ro 


C~ i / L? 

c rand/ ™ 


40 


1000000 


A A 

44 


1E8 


HAT^ 1 A 

.20E-14 


.21E-13 


.20E-15 


.61E-14 


a nn 
4UU 


i nnnnnn 
1UUUUUU 


Ar\A 
4U4 


iri/o 


OQ"P 1 A 

.Aoej— 14 


.0 i Ej— 1 1 


A ST? 1 ^ 


1 1 "CP 1/1 

. i iri/— 14 


4UUU 


1000000 


A nn A 

4004 


lEo 


OCT 1/1 

.25E-14 


1 pit 1 no 

.loE-08 


1 ^717 1 /I 

.1 fE-14 


O A T7 1 1 C 

.24E-15 


1000 


onnn 

3000 


1004 


1E8 


.17E-14 


.20E-08 


.85E-15 


.12E-15 


1000 


o r\r\r\r\ 

30000 


1004 


1E8 


.27E-13 


.54E-06 


.11E-14 


.95E-16 


1000 


300000 


1004 


1E8 


.15E-14 


.16E-09 


.76E-15 


.35E-15 


1000 


3000000 


1004 


1E8 


.22E-14 


.57E-11 


.11E-14 


.96E-15 


2000 


1000000 


2000 


1E8 


.23E-14 


.32E-09 


.12E-14 


.33E-15 


2000 


1000000 


4000 


1E8 


.23E-14 


.32E-09 


.80E-15 


.17E-15 


2000 


1000000 


8000 


1E8 


.23E-14 


.32E-09 


.75E-15 


.17E-15 


4000 


1000000 


4004 


1E4 


.22E-13 


.79E-12 


.59E-14 


.69E-15 


4000 


1000000 


4004 


1E6 


.46E-14 


.15E-10 


.32E-14 


.38E-15 


4000 


1000000 


4004 


1E8 


.25E-14 


.16E-08 


.17E-14 


.24E-15 


4000 


1000000 


4004 


1E10 


.37E-17 


.13E-12 


.89E-15 


.15E-15 


4000 


1000000 


4004 


1E12 


.59E-19 


.13E-14 


.45E-15 


.13E-15 






Table 3: Timings using normal variates 






m 


n 


/ 


K 


Sprc 




^pre 


^pro 


4000 


1000000 


4004 


1E4 


.26E4 


.11E1 


.54E4 


.12E1 


4000 


1000000 


4004 


1E6 


.25E4 


.10E1 


.54E4 


.12E1 


4000 


1000000 


4004 


1E8 


.25E4 


.10E1 


.54E4 


.12E1 


4000 


1000000 


4004 


1E10 


.25E4 


.11E1 


.56E4 


.13E1 


4000 


1000000 


4004 


1E12 


.25E4 


.10E1 


.54E4 


.12E1 






Table 4: Errors 


using normal variates 






m 


n 


/ 


K 


^norm/ ^ 


^norm/^ 


<5rand/ K> 


^rand/'t 


4000 


1000000 


4004 


1E4 


.22E-13 


.79E-12 


.59E-14 


.68E-15 


4000 


1000000 


4004 


1E6 


.46E-14 


.15E-10 


.36E-14 


.38E-15 


4000 


1000000 


4004 


1E8 


.25E-14 


.16E-08 


.20E-14 


.23E-15 


4000 


1000000 


4004 


1E10 


.37E-17 


.13E-12 


.95E-15 


.15E-15 


4000 


1000000 


4004 


1E12 


.59E-19 


.13E-14 


.43E-15 


.14E-15 
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Table 5: Timings for the dense matrix A defined in (22) 



m 


n 


I 


K 


Spre 


Spvo 




^-pro 


40 


1000000 


44 


1E8 


.28E2 


.11E1 


.42E2 


.11E1 


400 


1000000 


404 


1E8 


.28E3 


.11E1 


.41E3 


.11E1 


4UUU 


1000000 


4004 


1E8 


.30E4 


.12E1 


.0brlv4 


. ioii/1 


1000 


3000 


1004 


1E8 


.31E1 


.89E-2 


.19E2 


.15E-1 


1000 


30000 


1004 


1E8 


.89E1 


.19E-1 


.28E2 


.25E-1 


1000 


300000 


1004 


1E8 


.16E3 


.25E0 


.25E3 


.26E0 


1000 


3000000 


1004 


1E8 


.23E4 


.34E1 


.33E4 


.34E1 




Table 6: Errors for the dense matrix A defined in 


(22) 




771 


n 


I 


K 


^norm / ^ 




A , Ik 
°rand/ ™ 


C" / L? 

c rand/ ™ 


40 


1000000 


AA 


1E8 


.19E-16 


.48E-16 


.34E-18 


.15E-19 


400 


1000000 


404 


1E8 


.42E-16 


.32E-14 


.68E-17 


.31E-18 


4000 


1000000 


4004 


1E8 


.72E-14 


.11E-08 


.14E-14 


.27E-16 


1000 


3000 


1004 


1E8 


.48E-15 


.12E-10 


.28E-15 


.52E-16 


1000 


30000 


1004 


1E8 


.38E-15 


.12E-10 


.25E-15 


.17E-16 


1000 


300000 


1004 


1E8 


.65E-15 


.17E-11 


.30E-15 


.13E-16 


1000 


3000000 


1004 


1E8 


.14E-14 


.22E-11 


.28E-15 


.57E-16 



We performed all computations using IEEE standard double-precision variables, whose 
mantissas have approximately one bit of precision less than 16 digits (so that the relative 
precision of the variables is approximately .2E-15). We ran all computations on one core of a 
1.86 GHz Intel Centrino Core Duo microprocessor with 2 MB of L2 cache and 1 GB of RAM. 
We compiled the Fortran 77 code using the Lahey/Fujitsu Linux Express v6.2 compiler, with 
the optimization flag — o2 enabled. We used plane (Householder) reflections to compute all 
pivoted QR decompositions (see, for example, Chapter 5 of [7]). 

Remark 5.2. The numerical results reported here and our further experiments indicate 
that the timings of the classical scheme and the randomized method are similar, whereas 
the randomized method produces more accurate projections (specifically, the randomized 
projections are idempotent to higher precision). The quality of the pseudorandom numbers 
does not affect the accuracy of the algorithm much, if at all, nor does replacing the normal 
variates with uniform variates for the entries of G in (12). 
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